DLQ CHASM pure task if valid after execution, add unit test verifications to test framework #10502
awln-temporal wants to merge 4 commits into
yycptt left a comment
Overall looks good from chasm framework side. Please still take a look at the comments.
	attrs chasm.TaskAttributes,
	_ *schedulerpb.BackfillerTask,
) (bool, error) {
	if attrs.IsImmediate() {
does that mean we have an issue today?
Oh, never mind: immediate pure tasks will only run once today.
But I agree the validation logic should invalidate the task after execution regardless of whether the pure task is immediate.
I will let the scheduler crew review this part.
	TaskType string
	TaskTypeID uint32
	Archetype string
	ArchetypeID ArchetypeID
	ComponentPath []string
	EncodedComponentPath string
	ScheduledTime time.Time
	Destination string
	Immediate bool
shall we just embed TaskNotInvalidatedDetails here?
	ScheduledTime time.Time
	Destination string
	return fmt.Sprintf(
		"CHASM %s task remained valid after successful execution: %s",
		e.TaskKind,
		e.buildLogicalTaskReport(),
Ideally error messages should be fixed strings (so that filtering is easy), and any dynamic parts can be tags.
I think here you can either do the logging with additional tags inside ExecutePureTask, or define an error details / error log tags interface on the task processing side and implement it here.
We may also want to log the task and component state (truncated to some size limit) to help debugging.
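A rough sketch of the second option, for illustration only. Everything here is hypothetical and not taken from the PR: the `logTagger` interface, the `LogTags` method, the tag keys, the sample values, and the fixed message constant (which drops the `%s` task kind from the message and carries it as a tag instead). It uses only the Go standard library rather than the server's logging package.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// Fixed message: constant across occurrences so it is easy to filter on.
const taskNotInvalidatedMsg = "CHASM task remained valid after successful execution"

// taskNotInvalidatedError is a hypothetical error type; field names are illustrative.
type taskNotInvalidatedError struct {
	TaskKind    string
	TaskType    string
	Destination string
}

// Error returns only the fixed message, with no dynamic parts.
func (e *taskNotInvalidatedError) Error() string { return taskNotInvalidatedMsg }

// LogTags exposes the dynamic parts as key/value pairs (hypothetical keys).
func (e *taskNotInvalidatedError) LogTags() map[string]string {
	return map[string]string{
		"task-kind":   e.TaskKind,
		"task-type":   e.TaskType,
		"destination": e.Destination,
	}
}

// logTagger is the hypothetical interface the task processing side would check for.
type logTagger interface {
	LogTags() map[string]string
}

// formatLogLine renders the error message followed by any tags in sorted key order.
func formatLogLine(err error) string {
	var b strings.Builder
	b.WriteString(err.Error())
	var tagger logTagger
	if errors.As(err, &tagger) {
		tags := tagger.LogTags()
		keys := make([]string, 0, len(tags))
		for k := range tags {
			keys = append(keys, k)
		}
		sort.Strings(keys)
		for _, k := range keys {
			fmt.Fprintf(&b, " %s=%q", k, tags[k])
		}
	}
	return b.String()
}

func main() {
	err := &taskNotInvalidatedError{TaskKind: "pure", TaskType: "backfiller"}
	fmt.Println(formatLogLine(err))
}
```

Because `errors.As` walks the wrap chain, the tags would still be found if the error is wrapped with `%w` before it reaches the logging site.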
What changed?
If a pure task is still valid after execution, we now return a new task error and move the task to the DLQ to avoid stuck executions. This PR also adds unit test verifications to the test framework.
Why?
Task validation checks whether a task should be executed by running a predicate on the component state. Pure tasks should not be valid after execution, but as of now no verification is in place to prevent a task from remaining valid after execution, which leads to stuck executions: the logical task remains in the CHASM tree, and no other physical task is generated.
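The check described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in this PR: `component`, `pureTask`, `runPureTask`, `isNotInvalidated`, and `ErrTaskNotInvalidated` are hypothetical names, and the real framework's validator and executor signatures differ.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrTaskNotInvalidated is a hypothetical sentinel error for this sketch.
var ErrTaskNotInvalidated = errors.New("pure task remained valid after successful execution")

// component is a hypothetical stand-in for CHASM component state.
type component struct {
	pendingBackfills int
}

// pureTask pairs a validation predicate with an execution step.
type pureTask struct {
	validate func(c *component) (bool, error)
	execute  func(c *component) error
}

// runPureTask executes the task only if it is currently valid, then re-runs
// validation and returns an error if the task is still valid afterwards.
func runPureTask(c *component, t pureTask) error {
	valid, err := t.validate(c)
	if err != nil {
		return err
	}
	if !valid {
		return nil // stale task: nothing to execute
	}
	if err := t.execute(c); err != nil {
		return err
	}
	stillValid, err := t.validate(c)
	if err != nil {
		return err
	}
	if stillValid {
		// In the PR's terms, this is the case that would send the task to the DLQ.
		return ErrTaskNotInvalidated
	}
	return nil
}

// isNotInvalidated reports whether err is (or wraps) ErrTaskNotInvalidated.
func isNotInvalidated(err error) bool {
	return errors.Is(err, ErrTaskNotInvalidated)
}

func main() {
	validate := func(c *component) (bool, error) { return c.pendingBackfills > 0, nil }

	// A well-behaved task changes state so the predicate no longer holds.
	good := pureTask{validate: validate, execute: func(c *component) error {
		c.pendingBackfills = 0
		return nil
	}}
	// A buggy task succeeds without changing state, so it stays valid.
	buggy := pureTask{validate: validate, execute: func(c *component) error { return nil }}

	fmt.Println(runPureTask(&component{pendingBackfills: 2}, good))
	fmt.Println(runPureTask(&component{pendingBackfills: 2}, buggy))
}
```

Without the second `validate` call, the buggy task would report success while its logical task stays in the tree, which is the stuck-execution case described above.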
How did you test it?